Jupyter environment

Jupyter Notebook is an open-source web application that lets you create and share documents containing live code, equations, visualizations, and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.

We will be using a Jupyter server as the primary web interface for this workshop. Several notebooks have been provided in advance to guide you through the workshop. After the workshop, you may use the Agave Jupyter image to recreate the notebook server and repeat the workshop, or continue with your own work at your leisure.

The Agave image has several customizations to facilitate use of the platform and ease much of the heavy lifting done behind the scenes in this tutorial.

Custom Kernels

Your Jupyter server has multiple kernels available for use right away. We have preconfigured them with several useful libraries and tools to help you get up and running with common tasks more easily. Additionally, we have bundled the Agave CLI into the Bash kernel and the Python SDK into the Python 2 and Python 3 kernels. These kernels are pre-authenticated with valid Agave auth tokens that you can use to begin interacting with the Agave Platform right away.

Shared file system

Your $HOME/work directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.

Web console

Jupyter contains a web terminal that can be used to access your sandbox environment or interact with the Jupyter container itself. To log in to your sandbox from the Jupyter web terminal, run the following command:

ssh -p 10022 $VM_IPADDRESS
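
If VM_IPADDRESS happens to be unset in your terminal session, the login command above is run without a host and ssh only prints its usage message. The following is a minimal sketch of a guard that builds the same command and fails early with a clearer error. The function name sandbox_ssh_cmd is hypothetical and is not part of the tutorial image.

```shell
# Print the sandbox login command, or fail if VM_IPADDRESS is not set.
# sandbox_ssh_cmd is an illustrative helper, not something the image provides.
sandbox_ssh_cmd() {
  if [ -z "${VM_IPADDRESS:-}" ]; then
    echo "VM_IPADDRESS is not set; cannot locate the sandbox" >&2
    return 1
  fi
  echo "ssh -p 10022 ${VM_IPADDRESS}"
}

# To log in, run the printed command:
#   eval "$(sandbox_ssh_cmd)"
```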

Tutorial notebooks

This tutorial is presented as a series of Jupyter notebooks. If you are attending this tutorial in person, you will download the notebooks into the home directory of your notebook server. If you are following along after the fact, download the notebooks from the GitHub repository into your Jupyter workspace:

git clone --depth 1 https://github.com/agaveplatform/SC17-container-tutorial.git

API access

The tutorial walks you through the process of obtaining a set of API keys and authenticating to the Agave Platform. Once this is done, you no longer need to authenticate to follow the tutorial. Both the Agave CLI and the Python SDK will pick up your authorization cache and automatically refresh it as needed.

Extras

Inside of the examples directory, you will find several notebooks to help you learn more about the Agave platform, containers, and SciOps. We leave these for you to follow after the tutorial.


Sandbox environment

The tutorial sandbox is a full Ubuntu 16.04 server running as a Docker container on a VM dedicated for your use in this tutorial. The sandbox has a standard HPC build environment with OpenMPI, Python 2, Python 3, build-essential, gfortran, openssl, git, jq, vim, and a host of other utilities.

Container runtimes

Docker and Singularity are both pre-installed in your sandbox. All images used in this tutorial are available from the public Agave Docker Hub and Singularity Hub accounts. You may also use your own private registry accounts, but you will need to log in to the respective registries on your own.

Funwave example code

The sample code for this project is already present in $HOME/FUNWAVE-TVD.

Shared file system

Your $HOME/work directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.

Accessibility

To log in to the sandbox from outside the Jupyter server, use the host IP address. You will find the public IP address of your sandbox in the $VM_IPADDRESS environment variable. Valid ssh keys are available in the ~/.ssh directory of your Jupyter server. Alternatively, you can append your own public key to the $HOME/.ssh/authorized_keys file in the sandbox.

ssh -i /path/to/private/key.pem -p 10022 jovyan@$VM_IPADDRESS
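
Appending your own public key can be done by hand, but running the append twice leaves duplicate entries. The sketch below, meant to be run inside the sandbox, adds a key only when it is not already present. The function name add_authorized_key and the example key path are assumptions made for illustration.

```shell
# Append a public key to an authorized_keys file unless it is already there.
# Usage: add_authorized_key <public-key-file> [authorized_keys-file]
# add_authorized_key is an illustrative helper, not part of the sandbox image.
add_authorized_key() {
  pubkey_file="$1"
  auth_file="${2:-$HOME/.ssh/authorized_keys}"

  # Create the file if needed and keep it private to the owner,
  # since sshd in its default strict mode rejects loosely permissioned files.
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # -x matches whole lines only, -F treats the key as a literal string.
  if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
}

# Example (the key path is hypothetical):
#   add_authorized_key "$HOME/work/my_key.pub"
```

Because $HOME/work is shared between the Jupyter server and the sandbox, it is a convenient place to drop the public key file before running the helper.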

Persistence

Your VM will remain available for 1-2 days following the tutorial. During that time, your data will remain available. After that, the VM and any data saved on it will be destroyed. If you need to persist your data, we recommend that you move it to another host, or create your own account in the Agave public tenant and save your data in the free cloud storage provided to you there by default.
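
One simple way to prepare your data for the move is to pack it into a single archive first. The following is a sketch under the assumption that your data lives in $HOME/work; the function name archive_dir and the destination directory are hypothetical.

```shell
# Pack a directory into a timestamped tarball and print the archive path.
# Usage: archive_dir <source-dir> <destination-dir>
# archive_dir is an illustrative helper, not part of the tutorial image.
archive_dir() {
  src="$1"
  dest="$2"
  archive="$dest/$(basename "$src")-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest"
  # -C makes paths inside the archive relative to the parent of the source.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
  echo "$archive"
}

# Example: archive the shared work directory into /tmp
#   archive_dir "$HOME/work" /tmp
```

Choose a destination outside the directory being archived, then copy the printed archive to another host with scp or a similar tool before the VM is destroyed.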


Logging In

We have already configured resources for you to use in this tutorial.

Virtual Machine

Each of you has a dedicated VM provided by the Nectar Cloud. You will use this VM for the duration of the tutorial.

Training Account

A training account on the Agave Platform's public tenant has also been allocated to you.

Login

Your Jupyter server is available at <username>.sc17.training.agaveplatform.org.

Usernames will be training001 to training100. We will count off to determine our instance.

When you first log in, you will find your workspace empty, save for a notebook named "INSTALL.ipynb". Open this notebook by clicking on the notebook name, then click the "run" button. This will fetch all the tutorial notebooks from the tutorial's git repository and add them to your workspace.

Once that completes, open the Config notebook to begin the main part of the tutorial.


Following along at home

If you are following along with this tutorial at home, you can recreate the tutorial Jupyter server and sandbox environments by running the containers on your own server with the following Docker Compose file. Save the contents below in a file named docker-compose.yml.

version: '2'

volumes:
  training-volume:

services:
  jupyter:
    image: agaveplatform/jupyter-notebook:latest
    command: start-notebook.sh --NotebookApp.token=''
    mem_limit: 2048m
    ports:
      - '8888:8005'
    environment:
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
      - VM_HOSTNAME=localhost:8888
      - USE_TUNNEL=True
      - ENVIRONMENT=training
      - SCRATCH_DIR=/home/jovyan
      - MACHINE_USERNAME=jovyan
      - MACHINE_NAME=sandbox
      - DOCKERHUB_NAME=stevenrbrandt
      - AGAVE_APP_DEPLOYMENT_PATH=agave-deployment
      - AGAVE_CACHE_DIR=/home/jovyan/work/.agave
      - AGAVE_JSON_PARSER=jq
      - AGAVE_USERNAME=${AGAVE_USERNAME}
      - AGAVE_PASSWORD=${AGAVE_PASSWORD}
      - AGAVE_SYSTEM_SITE_DOMAIN=localhost
      - AGAVE_STORAGE_WORK_DIR=/home/jovyan
      - AGAVE_STORAGE_HOME_DIR=/home/jovyan
      - AGAVE_APP_NAME=funwave-tvd-sc17-${AGAVE_USERNAME}
      - AGAVE_STORAGE_SYSTEM_ID=nectar-storage-${AGAVE_USERNAME}
      - AGAVE_EXECUTION_SYSTEM_ID=nectar-exec${AGAVE_USERNAME}
    volumes:
      - training-volume:/home/jovyan/work
      - ../notebooks:/home/jovyan/notebooks
  sandbox:
    image: agaveplatform/sc17-sandbox:latest
    mem_limit: 2048m
    privileged: True
    ports:
      - '10022:22'
    environment:
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
      - NGROK_TOKEN=${NGROK_TOKEN}
      - USE_TUNNEL=True
      - ENVIRONMENT=training
      - AGAVE_CACHE_DIR=/home/jovyan/work/.agave
    volumes:
      - training-volume:/home/jovyan/work
      - /var/run/docker.sock:/var/run/docker.sock
      - $HOME/.docker:/home/jovyan/.docker:ro

To run the above, you first need to set the environment variables AGAVE_USERNAME, AGAVE_PASSWORD, and NGROK_TOKEN. The first two should be your Agave username and password as obtained from Agave TOGO. The ngrok token should be obtained from ngrok.

Ngrok will provide tunnelling so that Agave can ssh into your laptop or desktop machine. It does this by setting VM_IPADDRESS, VM_HOSTNAME, and VM_SSH_PORT for you.

Once you have these things set up, you should be able to run docker-compose up. Run this command from the same directory in which you created your docker-compose.yml file. You should then be able to use your browser to connect to the tutorial setup on port 8888 of your local machine (http://localhost:8888).
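
Docker Compose substitutes the ${...} references in the file from your environment, so an unset variable silently becomes an empty string. As a minimal sketch, the check below confirms that the three required variables are exported before you start the containers. The function name check_tutorial_env is hypothetical.

```shell
# Report which required variables are missing from the exported environment.
# Returns 0 only when all three are set and non-empty.
# check_tutorial_env is an illustrative helper, not part of the tutorial.
check_tutorial_env() {
  missing=0
  for name in AGAVE_USERNAME AGAVE_PASSWORD NGROK_TOKEN; do
    # printenv only sees exported variables, which is what docker-compose sees.
    if [ -z "$(printenv "$name" || true)" ]; then
      echo "missing: $name (set it with: export $name=...)" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: only start the containers when the environment is complete
#   check_tutorial_env && docker-compose up
```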